@@include("_includes/header.html", {
    "title": "Regular expression help and examples for grepWin",
    "metaKeywords": "<meta name=\"keywords\" content=\"grepWin,regex,tutorial,examples\">"
})

<div class="wrapper">
<div class="content">
     <h2>Regular expression help and examples for grepWin</h2>

     <p>
         grepWin uses the <a href="http://www.boost.org">Boost</a> regex engine to do its work,
         with the <a href="http://www.boost.org/doc/libs/1_57_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html">Perl Regular Expression Syntax</a>.
     </p>

     <h3 id="intro">Introduction</h3>
     <p>
         I'll only explain the very basics of regular
         expressions, plus some special variables you can use
         in grepWin that aren't part of the official regular
         expression syntax.
     </p>
     <p>
         For a much more detailed tutorial on regular expressions,
         please go to <a href="http://www.regular-expressions.info/">regular-expressions.info</a>,
         which also explains a lot about how regex engines work internally.
     </p>

     <h3 id="basics">Search basics</h3>
     <dl>
         <dt>. (dot)</dt>
         <dd>
         A dot matches any single character. Searching for <code>t.t</code>
         will match <code>tat</code> as well as <code>tut</code>.
         </dd>
         <dt>+</dt>
         <dd>
         Matches the previous expression one or more times.
         Searching for <code>spel+ing</code> will
         find words like <code>speling</code> or <code>spelling</code>,
         but not <code>speing</code>, since the <code>l</code>
         must be matched at least once.
         </dd>
         <dt>*</dt>
         <dd>
         Matches the previous expression zero or more times.
         Searching for <code>spel*ing</code> will
         find words like <code>speling</code> or <code>spelling</code>,
         and also <code>speing</code>, since the <code>l</code>
         can be matched zero times, which means it doesn't have
         to be there.
         </dd>
         <dt>\</dt>
         <dd>
         The backslash escapes characters that would
         otherwise have a special meaning. Searching for a double
         dot in your text with <code>..</code> would not work,
         since each dot matches any character and the pattern would
         therefore match any two characters. To search for
         a literal double dot you have to escape both dots like this:
         <code>\.\.</code>.
         </dd>
         <dt>\Q..\E</dt>
         <dd>
         If you need to search for a literal string that
         contains a lot of special characters, you can
         use the <code>\Q..\E</code> sequence. Searching for
         <code>*.*</code> would not find that literal text, because
         <code>*</code> and <code>.</code> have special meanings, unless you
         escape every single char like this: <code>\*\.\*</code>.
         For such search strings it's easier to just put
         them inside the <code>\Q..\E</code> sequence like this:
         <code>\Q*.*\E</code>.
         </dd>
         <dt>[]</dt>
         <dd>
         With square brackets you can specify so-called character
         classes. A class matches any single char that is
         listed between the brackets. Searching for <code>[-+0-9]+</code>
         will find any run of one or more chars that consists only of
         '-', '+' and the digits 0 to 9. It will match
         <code>-123</code>, <code>+123</code> or <code>123</code>,
         but not <code>testword</code>. There are a few
         predefined character classes so you don't have
         to create one yourself. You can find a list of those
         classes <a href="http://www.boost.org/doc/libs/1_57_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html#boost_regex.syntax.perl_syntax._quot_single_character_quot__character_classes_">here</a>.
         The most used ones are <code>\d</code> which matches any digit,
         <code>\w</code> which matches any word char and
         <code>\s</code> which matches any whitespace char.
         </dd>
         <dt>^, $</dt>
         <dd>
         The caret matches the beginning of a line, and the dollar
         sign <code>$</code> matches the end of a line.
         Searching for <code>^title$</code> will only find
         lines that consist of nothing but the word <code>title</code>,
         not places where the word <code>title</code> appears inside
         a longer line. Searching for <code>^//</code> will find all lines
         that start with two slashes, but not lines where the two
         slashes appear anywhere else.
         Searching for <code>goodbye\.$</code> will find lines
         that end with <code>goodbye.</code>, but not lines where <code>goodbye.</code>
         is somewhere in the middle.
         </dd>
         <dt>\b</dt>
         <dd>
         <code>\b</code> matches word boundaries. Searching for
         <code>\bword\b</code> finds <code>word</code>, but not
         <code>subwords</code> or <code>words</code>.
         </dd>
         <dt>()</dt>
         <dd>
         A pair of parentheses defines a group. Grouping is useful
         for more advanced regex searching, but also
         when replacing text. Each group that matches part of
         the full matching string can be referenced later in
         the replace string.
         </dd>
         <dt>|</dt>
         <dd>
         The <code>|</code> char is used as an OR operator.
         Searching for <code>cat|dog</code> will match either
         <code>cat</code> or <code>dog</code>. Note that the OR
         operator applies to everything to its left and right.
         If you want to limit the reach of the operator, you have
         to use parentheses to group the alternatives. Searching for <code>(cat|dog)food</code>
         finds <code>catfood</code> and <code>dogfood</code>.
         </dd>
     </dl>
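     <p>
         The constructs above can be tried out in a few lines of script. The following is a
         minimal sketch that uses Python's <code>re</code> module as a stand-in for the Boost engine,
         which is an assumption made purely for illustration. The basic operators shown here behave the
         same way in both, but Python has no <code>\Q..\E</code>, so <code>re.escape</code> is used
         in its place. The helper name <code>matches</code> is hypothetical.
     </p>

```python
import re

# Assumption: Python's re module stands in for Boost's Perl syntax here.
# Only the basic constructs described above are exercised.

def matches(pattern, text):
    """Return every non-overlapping match of pattern in text, line-aware."""
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.MULTILINE)]

words = "speing speling spelling"

dot = matches(r"t.t", "tat tut tt")                    # ['tat', 'tut']
plus = matches(r"spel+ing", words)                     # ['speling', 'spelling']
star = matches(r"spel*ing", words)                     # ['speing', 'speling', 'spelling']
escaped = matches(r"\.\.", "a..b a.b")                 # ['..']
# re.escape plays the role of \Q..\E, which Python does not support.
literal = matches(re.escape("*.*"), "copy *.* now")    # ['*.*']
charclass = matches(r"[-+0-9]+", "-123 +123 123 testword")   # ['-123', '+123', '123']
anchored = matches(r"^title$", "title\nsubtitle\ntitle page")  # ['title']
boundary = matches(r"\bword\b", "word subwords words")         # ['word']
alternation = matches(r"(cat|dog)food", "catfood dogfood birdfood")  # ['catfood', 'dogfood']

for name, result in [("dot", dot), ("plus", plus), ("star", star),
                     ("escaped", escaped), ("literal", literal),
                     ("charclass", charclass), ("anchored", anchored),
                     ("boundary", boundary), ("alternation", alternation)]:
    print(name, result)
```

     <p>
         The <code>re.MULTILINE</code> flag is what makes <code>^</code> and <code>$</code> match at
         line boundaries in Python, mirroring the line-based behaviour described above.
     </p>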

     <h3 id="replacing">Replacing</h3>
     <p>
         Replacing strings is no more complicated than searching.
         Whatever the search finds is replaced with the replace
         string. Searching for <code>cat</code> and replacing
         it with <code>dog</code> is the most basic example and
         works just like you'd expect.
     </p>
     <p>
         In replace strings, you can also use references.
     </p>
     <dl>
         <dt>$1..$9</dt>
         <dd>
         In replace strings, you can refer to matched groups
         from the search string, using <code>$1</code> through <code>$9</code>.
         For example, if you search for <code>(cats) and (dogs)</code>
         and replace it with <code>$2 and $1</code>, the string
         <code>cats and dogs</code> gets replaced with <code>dogs and cats</code>.
         <code>$1</code> refers to the first matching group, which is <code>cats</code>,
         and <code>$2</code> refers to the second matching group,
         which is <code>dogs</code>.
         </dd>
         <dt>${filepath}, ${filename}, ${fileext}</dt>
         <dd>
         The <code>${filepath}</code> reference gets replaced with the
         full path of the current file. <code>${filename}</code>
         gets replaced with the filename without the file extension,
         and <code>${fileext}</code> gets replaced with the file
         extension of the current file. These references are specific to grepWin.
         </dd>
         <dt>${count0N}, ${count0N(AA)}, ${count0N(AA,BB)}</dt>
         <dd>
         grepWin also offers a special replace reference for
         counting. <code>${count0N}</code> is replaced with
         numbers starting from 1 and incremented by 1. The <code>0</code>
         and <code>N</code> are optional and control the formatting
         of the number. <code>N</code> is a number that specifies
         how many chars the number should occupy. A shorter number is
         padded with spaces to that width. If <code>0</code>
         is specified, the number is padded with leading zeros instead.
         You can also specify the start value with the <code>AA</code>
         number, and the increment with the <code>BB</code>
         number.
         </dd>
     </dl>
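     <p>
         To make the references above more concrete, here is a rough sketch in Python. It is not
         how grepWin implements them. Python writes group references as <code>\1</code> and
         <code>\2</code> instead of <code>$1</code> and <code>$2</code>, and the helper
         <code>expand_file_vars</code> is a hypothetical emulation of the file references. Whether
         grepWin's <code>${fileext}</code> includes the leading dot is not stated above, so the
         sketch assumes it does not.
     </p>

```python
import re
from pathlib import PureWindowsPath

# Group references: Python spells $1 and $2 as \1 and \2.
swapped = re.sub(r"(cats) and (dogs)", r"\2 and \1", "I like cats and dogs")
print(swapped)  # I like dogs and cats

def expand_file_vars(template, path):
    """Hypothetical emulation of ${filepath}, ${filename} and ${fileext}.

    Assumption: ${fileext} is the extension without its leading dot.
    """
    p = PureWindowsPath(path)
    values = {
        "filepath": str(p),                # full path of the file
        "filename": p.stem,                # name without the extension
        "fileext": p.suffix.lstrip("."),   # extension, dot removed (assumed)
    }
    return re.sub(r"\$\{(filepath|filename|fileext)\}",
                  lambda m: values[m.group(1)], template)

sample = r"C:\docs\report.txt"  # made-up example path
print(expand_file_vars("${filename} has type ${fileext}", sample))
print(expand_file_vars("${filepath}", sample))
```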



     <h3 id="examples">Replacing examples</h3>
     <dl>
         <dt>insert line numbers at the start of each line</dt>
         <dd>
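         <p>Search string: <code>^</code></p>
         <p>Replace string: <code>${count04} </code> (note the trailing space after the closing brace, which separates the number from the line text)</p>
         <p>Results in:</p>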
<pre>
0001 line 1
0002 line 2
0003 line 3
</pre>
        </dd>
    </dl>
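     <p>
         For readers who want to see the counting logic spelled out, the line-numbering example can be
         approximated in Python. This is only a sketch of the behaviour described on this page, not
         grepWin's actual code. The names <code>COUNT_REF</code>, <code>make_replacer</code> and
         <code>number_lines</code> are hypothetical, padding is assumed to go on the left, and only
         one counter reference per replace string is handled.
     </p>

```python
import re
from itertools import count

# Matches ${count}, ${count4}, ${count04}, ${count04(10)} and ${count04(10,5)}.
COUNT_REF = re.compile(r"\$\{count(0?)(\d*)(?:\((\d+)(?:,(\d+))?\))?\}")

def make_replacer(replace_string):
    """Build a re.sub callback that expands one counter reference per match."""
    ref = COUNT_REF.search(replace_string)
    if ref is None:
        return lambda _match: replace_string
    zero, width, start, step = ref.groups()
    # Defaults follow the description: start at 1, increment by 1.
    numbers = count(int(start or 1), int(step or 1))
    pad = "0" if zero else " "
    def replacer(_match):
        # Assumption: padding is added on the left of the number.
        text = str(next(numbers)).rjust(int(width or 0), pad)
        return replace_string[:ref.start()] + text + replace_string[ref.end():]
    return replacer

def number_lines(text, replace_string="${count04} "):
    """Insert a running counter at the start of every line."""
    return re.sub(r"^", make_replacer(replace_string), text, flags=re.MULTILINE)

print(number_lines("line 1\nline 2\nline 3"))
# 0001 line 1
# 0002 line 2
# 0003 line 3

# Start at 10 and step by 5, zero-padded to three chars.
print(re.sub(r"x", make_replacer("${count03(10,5)}"), "x x x"))  # 010 015 020
```

     <p>
         The trailing space in the default replace string is what separates the number from the
         original line text in the output.
     </p>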
</div>
</div>

@@include("_includes/footer.html")
